# INTRODUCTION AND MODEL CONSTRUCTION

This analysis uses electricity consumption data from 1 January 2016 to 20 May 2021 to build and compare alternative forecasting approaches. To make the series as stationary as possible, we transform the data (weekly, monthly, etc.). As a final step, we forecast with our model and report the difference from the real data.

We first take the data from (https://seffaflik.epias.com.tr/transparency/tuketim/gerceklesen-tuketim/gercek-zamanlituketim.xhtml) and rename the columns to suitable names.
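The renaming step could look roughly like the sketch below. The raw column names (`Tarih`, `Saat`, `Tuketim`), the number format and the three sample rows are assumptions made up for illustration; the actual export may be laid out differently. Only `el_cons` and `Consumption` are names taken from this report.

```r
# Hypothetical raw export; column names, number format and values are made up.
raw <- data.frame(
  Tarih = c("01.01.2016", "01.01.2016", "01.01.2016"),
  Saat = c("00:00", "01:00", "02:00"),
  Tuketim = c("26.277,24", "24.991,82", "23.532,61"),
  stringsAsFactors = FALSE
)

el_cons <- raw
names(el_cons) <- c("Date", "Hour", "Consumption")

# Convert "26.277,24" (dot as thousands separator, comma as decimal) to numeric.
to_number <- function(x) {
  no_thousands <- gsub(".", "", x, fixed = TRUE)
  as.numeric(gsub(",", ".", no_thousands, fixed = TRUE))
}

el_cons$Consumption <- to_number(el_cons$Consumption)
el_cons$Date <- as.Date(el_cons$Date, format = "%d.%m.%Y")
```

Base R is used here only to keep the sketch self-contained; the same renaming can be done with `data.table::setnames()`.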

## Loading required package: lubridate
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Loading required package: data.table
## 
## Attaching package: 'data.table'
## The following objects are masked from 'package:lubridate':
## 
##     hour, isoweek, mday, minute, month, quarter, second, wday, week,
##     yday, year

We add a datetime column (a combination of the Date and Hour columns) and check the autocorrelation of Consumption.

acf(el_cons$Consumption)
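A minimal sketch of how the datetime column might be built before calling `acf()`. The toy rows, the `"%H:%M"` hour format and the UTC time zone are assumptions; the column name `datetime` is also only illustrative.

```r
# Toy stand-in for el_cons (values are made up for illustration).
el_cons <- data.frame(
  Date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02")),
  Hour = c("00:00", "23:00", "00:00"),
  Consumption = c(26.3, 27.1, 25.8)
)

# Paste date and hour together and parse them as one timestamp.
# The time zone is an assumption.
el_cons$datetime <- as.POSIXct(
  paste(el_cons$Date, el_cons$Hour),
  format = "%Y-%m-%d %H:%M",
  tz = "UTC"
)

# plot = FALSE returns the autocorrelations without drawing them.
hourly_acf <- acf(el_cons$Consumption, plot = FALSE)
```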

As a first approach, we show the autocorrelation of the hourly data and plot the time series.

The time series plot is hard to read. The random component gives us more insight into the data, so we decompose the time series and plot the decomposition.

These plots are still hard to interpret. I prefer to look at the autocorrelation of the detrended data and move on to a daily model.
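The decomposition and the autocorrelation of the detrended series can be sketched as follows. The series is synthetic, `frequency = 24` assumes a daily cycle in hourly data, and taking the random component of an additive `decompose()` as the "detrended" data is an assumption about what this report means by that term.

```r
set.seed(1)
# Synthetic hourly series: 10 days with a daily cycle plus noise (made-up data).
hours <- 1:(24 * 10)
x <- 30 + 5 * sin(2 * pi * hours / 24) + rnorm(length(hours))

# frequency = 24 assumes one seasonal cycle per day.
hourly_ts <- ts(x, frequency = 24)
dec <- decompose(hourly_ts, type = "additive")

# The random component is NA at both ends, where the centred
# moving-average trend cannot be computed.
detrended <- na.omit(dec$random)
acf_detrended <- acf(detrended, plot = FALSE)
```

Calling `plot(dec)` draws the observed, trend, seasonal and random panels in one figure.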

Using the dplyr and zoo libraries, we aggregate the hourly data to daily data. After that step, we look at the autocorrelation and the time series plot, which should be more readable than before.
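A sketch of the hourly-to-daily aggregation. Base R is used so the example runs without extra packages; averaging (rather than summing) the hourly values is an assumption, and the toy data are made up. `frequency = 30` mirrors the `Frequency = 30` that appears in the forecast output later in this report.

```r
# Toy hourly data: two days with constant consumption (made-up values).
el_cons <- data.frame(
  Date = rep(as.Date("2016-01-01") + 0:1, each = 24),
  Consumption = c(rep(10, 24), rep(20, 24))
)

# One row per day; FUN = mean is an assumption, sum would also be plausible.
# dplyr equivalent:
#   el_cons %>% group_by(Date) %>% summarise(Consumption = mean(Consumption))
daily <- aggregate(Consumption ~ Date, data = el_cons, FUN = mean)

daily_ts <- ts(daily$Consumption, frequency = 30)
```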

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:data.table':
## 
##     between, first, last
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

As we did for the hourly data, we now decompose the daily series and plot the decomposition.

The observations are already clearer. Next, we examine the autocorrelation of the detrended data.

This result is better than before, but there are still some peaks.

This improvement encourages me to work with weekly data, applying the same steps. After aggregating the data to the weekly level, we plot the time series and its decomposition.
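The weekly aggregation (and, in the same way, a monthly one) could be sketched like this. The grouping by `cut()` and `format()`, the use of the mean and the toy values are assumptions for illustration.

```r
# Toy daily data: two full weeks starting on Monday 2016-01-04 (made-up values).
daily <- data.frame(
  Date = as.Date("2016-01-04") + 0:13,
  Consumption = c(rep(1, 7), rep(3, 7))
)

# cut() with breaks = "week" starts weeks on Monday by default.
daily$Week <- cut(daily$Date, breaks = "week")
weekly <- aggregate(Consumption ~ Week, data = daily, FUN = mean)

# Monthly grouping works the same way with a year-month key.
daily$Month <- format(daily$Date, "%Y-%m")
monthly <- aggregate(Consumption ~ Month, data = daily, FUN = mean)
```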

The plot of the random component looks the best so far. We take the detrended data and display its autocorrelation.

According to the autocorrelation, successive observations may be correlated, so we expect an autoregressive part in the model we build.

Next we look at monthly data. We aggregate the data and display the monthly time series below.

The monthly series does not seem informative to me. Still, it is worth looking at the decomposition plot and the autocorrelation of the detrended data.

Although the monthly autocorrelation looks better, the random term of the weekly consumption is better behaved. I will be looking for a pattern every 7 days or at multiples of 7.

Let us start by building AR models. The p parameter takes the values 1, 2, 3 (to be sure of the pattern), 7, 14, 21 and 28, respectively.

## 
## Call:
## arima(x = detrend_daily, order = c(1, 0, 0))
## 
## Coefficients:
##          ar1  intercept
##       0.5150     0.0053
## s.e.  0.0195     0.0924
## 
## sigma^2 estimated as 3.89:  log likelihood = -4064.32,  aic = 8134.65
## [1] 8134.65
## [1] 8151.357
## 
## Call:
## arima(x = detrend_daily, order = c(2, 0, 0))
## 
## Coefficients:
##          ar1      ar2  intercept
##       0.6207  -0.2055     0.0054
## s.e.  0.0222   0.0223     0.0750
## 
## sigma^2 estimated as 3.726:  log likelihood = -4022.61,  aic = 8053.22
## [1] 8053.222
## [1] 8075.498
## 
## Call:
## arima(x = detrend_daily, order = c(3, 0, 0))
## 
## Coefficients:
##          ar1      ar2      ar3  intercept
##       0.6056  -0.1598  -0.0735     0.0054
## s.e.  0.0227   0.0263   0.0227     0.0697
## 
## sigma^2 estimated as 3.706:  log likelihood = -4017.37,  aic = 8044.75
## [1] 8044.749
## [1] 8072.594
## 
## Call:
## arima(x = detrend_daily, order = c(7, 0, 0))
## 
## Coefficients:
##          ar1      ar2     ar3      ar4      ar5      ar6     ar7  intercept
##       0.5139  -0.0941  0.0158  -0.0551  -0.0953  -0.0241  0.4000     0.0096
## s.e.  0.0208   0.0239  0.0239   0.0239   0.0239   0.0239  0.0208     0.1143
## 
## sigma^2 estimated as 2.925:  log likelihood = -3789.11,  aic = 7596.23
## [1] 7596.226
## [1] 7646.346
## 
## Call:
## arima(x = detrend_daily, order = c(14, 0, 0))
## 
## Coefficients:
##          ar1      ar2     ar3      ar4      ar5      ar6     ar7      ar8
##       0.8950  -0.2517  0.0465  -0.1563  -0.0062  -0.1511  0.8148  -0.8482
## s.e.  0.0227   0.0305  0.0309   0.0309   0.0310   0.0309  0.0242   0.0242
##          ar9     ar10    ar11    ar12    ar13     ar14  intercept
##       0.1645  -0.1117  0.0706  -0.081  0.0641  -0.0133     0.0044
## s.e.  0.0308   0.0310  0.0309   0.031  0.0306   0.0228     0.0441
## 
## sigma^2 estimated as 1.196:  log likelihood = -2926.61,  aic = 5885.22
## [1] 5885.224
## 
## Call:
## arima(x = detrend_daily, order = c(21, 0, 0))
## 
## Coefficients:
##          ar1      ar2     ar3      ar4      ar5      ar6     ar7      ar8
##       0.9319  -0.2826  0.0450  -0.1473  -0.0237  -0.0999  0.4898  -0.4879
## s.e.  0.0227   0.0310  0.0317   0.0317   0.0318   0.0317  0.0299   0.0308
##          ar9     ar10     ar11     ar12     ar13    ar14     ar15    ar16
##       0.0372  -0.0766  -0.0086  -0.0222  -0.0679  0.3610  -0.4744  0.1407
## s.e.  0.0327   0.0327   0.0328   0.0327   0.0328  0.0309   0.0300  0.0318
##          ar17    ar18     ar19    ar20     ar21  intercept
##       -0.0639  0.0460  -0.0632  0.0554  -0.0264     0.0035
## s.e.   0.0319  0.0318   0.0318  0.0312   0.0228     0.0311
## 
## sigma^2 estimated as 1.015:  log likelihood = -2768.58,  aic = 5583.15
## [1] 5583.153
## 
## Call:
## arima(x = detrend_daily, order = c(28, 0, 0))
## 
## Coefficients:
##          ar1      ar2     ar3      ar4      ar5      ar6     ar7      ar8
##       0.9339  -0.2761  0.0417  -0.1450  -0.0430  -0.0591  0.3475  -0.3656
## s.e.  0.0227   0.0311  0.0317   0.0317   0.0319   0.0319  0.0310   0.0313
##          ar9     ar10     ar11     ar12     ar13    ar14     ar15    ar16
##       0.0031  -0.0691  -0.0198  -0.0442  -0.0559  0.2132  -0.3095  0.0949
## s.e.  0.0324   0.0323   0.0324   0.0324   0.0324  0.0317   0.0317  0.0324
##          ar17     ar18     ar19     ar20    ar21     ar22    ar23     ar24
##       -0.0548  -0.0156  -0.0444  -0.0378  0.2826  -0.3416  0.0441  -0.0452
## s.e.   0.0325   0.0325   0.0325   0.0325  0.0314   0.0310  0.0320   0.0320
##         ar25     ar26    ar27     ar28  intercept
##       0.0427  -0.0021  0.0122  -0.0254     0.0028
## s.e.  0.0318   0.0318  0.0312   0.0228     0.0233
## 
## sigma^2 estimated as 0.9154:  log likelihood = -2669.85,  aic = 5399.7
## [1] 5399.704

Based on these AIC values (BIC is printed only for the first four models), the autoregressive model with p = 28 is the best. Its AIC is 5399.704, which is lower than 5583.153 for p = 21 and 5885.224 for p = 14, and much lower than the values for the low-order models (8134.65 for p = 1).
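The comparison above can be reproduced in a loop. The sketch below fits the AR orders with `arima()` and collects AIC and BIC in one table; the synthetic AR(2) series only stands in for `detrend_daily`, and just the low orders are fitted to keep the example fast.

```r
set.seed(42)
# Synthetic stand-in for detrend_daily: an AR(2) process (made-up data).
detrend_daily <- arima.sim(model = list(ar = c(0.6, -0.2)), n = 300)

# The report also uses 7, 14, 21 and 28; they are left out here for speed.
orders <- c(1, 2, 3)
fits <- lapply(orders, function(p) arima(detrend_daily, order = c(p, 0, 0)))

# One row per model, so the information criteria can be compared directly.
comparison <- data.frame(
  p = orders,
  AIC = sapply(fits, AIC),
  BIC = sapply(fits, BIC)
)

best_p <- comparison$p[which.min(comparison$AIC)]
```

Collecting the criteria in a data frame avoids reading them off seven separate printouts and makes it easy to sort by either column.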

We repeat the procedure for MA models. The q values run from 1 to 3 in steps of 1 and from 7 to 28 in steps of 7.

## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 1))
## 
## Coefficients:
##          ma1  intercept
##       0.5670     0.0054
## s.e.  0.0222     0.0698
## 
## sigma^2 estimated as 3.843:  log likelihood = -4052.47,  aic = 8110.93
## [1] 8110.934
## [1] 8127.641
## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 2))
## 
## Coefficients:
##          ma1     ma2  intercept
##       0.6234  0.2265     0.0054
## s.e.  0.0244  0.0281     0.0813
## 
## sigma^2 estimated as 3.742:  log likelihood = -4026.67,  aic = 8061.34
## [1] 8061.339
## [1] 8083.615
## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 3))
## 
## Coefficients:
##          ma1     ma2      ma3  intercept
##       0.6907  0.2781  -0.3487     0.0054
## s.e.  0.0208  0.0251   0.0199     0.0681
## 
## sigma^2 estimated as 3.425:  log likelihood = -3941.5,  aic = 7893
## [1] 7892.998
## [1] 7920.842
## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 7))
## 
## Coefficients:
##          ma1     ma2     ma3      ma4      ma5      ma6      ma7  intercept
##       0.7445  0.3417  0.0235  -0.2794  -0.5522  -0.7742  -0.1079     0.0018
## s.e.  0.0225  0.0257  0.0273   0.0319   0.0302   0.0321   0.0287     0.0138
## 
## sigma^2 estimated as 2.3:  log likelihood = -3557.26,  aic = 7132.52
## [1] 7132.519
## [1] 7182.639
## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 14))
## 
## Coefficients:
##          ma1    ma2     ma3      ma4      ma5      ma6     ma7     ma8      ma9
##       0.7300  0.342  0.0930  -0.1456  -0.3514  -0.4778  0.2117  0.0381  -0.1638
## s.e.  0.0237  0.028  0.0275   0.0302   0.0284   0.0295  0.0311  0.0318   0.0328
##          ma10     ma11     ma12     ma13    ma14  intercept
##       -0.2665  -0.3315  -0.3626  -0.3861  0.0708      2e-04
## s.e.   0.0323   0.0293   0.0245   0.0423  0.0389      9e-04
## 
## sigma^2 estimated as 1.698:  log likelihood = -3266.02,  aic = 6564.03
## [1] 6564.032
## [1] 6653.134
## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 21))
## 
## Coefficients:
##          ma1     ma2     ma3      ma4      ma5      ma6     ma7     ma8
##       0.7692  0.3947  0.1281  -0.1436  -0.3772  -0.5065  0.1233  0.0058
## s.e.  0.0238  0.0296  0.0297   0.0319   0.0330   0.0391  0.0395  0.0357
##           ma9     ma10     ma11     ma12     ma13    ma14    ma15    ma16
##       -0.1171  -0.2335  -0.3200  -0.4610  -0.4446  0.1584  0.0504  0.0795
## s.e.   0.0345   0.0320   0.0297   0.0289   0.0378  0.0373  0.0335  0.0378
##         ma17    ma18     ma19     ma20    ma21  intercept
##       0.0656  0.0260  -0.1374  -0.1980  0.1379      1e-04
## s.e.  0.0407  0.0373   0.0286   0.0362  0.0315      9e-04
## 
## sigma^2 estimated as 1.474:  log likelihood = -3129.7,  aic = 6305.39
## [1] 6305.39
## [1] 6433.475
## 
## Call:
## arima(x = detrend_daily, order = c(0, 0, 28))
## 
## Coefficients:
##          ma1     ma2     ma3      ma4      ma5      ma6     ma7     ma8
##       0.8031  0.3946  0.1093  -0.1766  -0.4005  -0.5339  0.1064  0.0270
## s.e.  0.0257  0.0311  0.0381   0.0341   0.0398   0.0134  0.0237  0.0389
##           ma9     ma10     ma11     ma12     ma13    ma14    ma15    ma16
##       -0.1385  -0.2491  -0.3590  -0.4846  -0.4813  0.1576  0.1024  0.0924
## s.e.   0.0291   0.0364   0.0298   0.0397      NaN     NaN  0.0459  0.0220
##         ma17    ma18     ma19     ma20    ma21    ma22    ma23    ma24    ma25
##       0.0984  0.0303  -0.1199  -0.2178  0.1801  0.0491  0.0021  0.0137  0.0104
## s.e.  0.0285  0.0160   0.0455      NaN     NaN  0.0530  0.0264  0.0341  0.0209
##         ma26     ma27    ma28  intercept
##       0.0045  -0.0370  0.0193      1e-04
## s.e.  0.0330   0.0269  0.0181      8e-04
## 
## sigma^2 estimated as 1.423:  log likelihood = -3095.93,  aic = 6251.86
## [1] 6251.864
## [1] 6418.931

Among the MA models, q = 28 gives the best AIC value, 6251.86. The autoregressive model with p = 28 is still better, however, with an AIC of 5399.704.

We now build a combined autoregressive moving average (ARMA) model from the best p and q values found so far, which are 28 and 28. We start with (p,q)=(28,28).

## 
## Call:
## arima(x = detrend_daily, order = c(28, 0, 28))
## 
## Coefficients:
##          ar1      ar2     ar3      ar4     ar5     ar6      ar7     ar8
##       0.6522  -0.1385  0.1864  -0.0178  0.0932  0.0907  -0.5712  0.6065
## s.e.     NaN   0.1376     NaN      NaN  0.0735     NaN      NaN  0.0185
##           ar9    ar10    ar11     ar12     ar13    ar14     ar15    ar16
##       -0.3533  0.0579  0.0079  -0.0722  -0.1659  0.6066  -0.5399  0.2713
## s.e.   0.0627  0.1095  0.1072      NaN   0.1093  0.0402      NaN  0.1220
##          ar17     ar18    ar19    ar20    ar21     ar22    ar23    ar24    ar25
##       -0.0241  -0.0095  0.0043  0.0481  0.8194  -0.7366  0.2039  -0.237  0.0025
## s.e.   0.1094   0.0575  0.0856  0.1027     NaN      NaN  0.0946     NaN  0.1371
##          ar26    ar27    ar28     ma1     ma2      ma3      ma4      ma5
##       -0.0415  0.0111  0.1273  0.2829  0.1253  -0.0966  -0.2398  -0.3652
## s.e.      NaN  0.0904     NaN     NaN     NaN   0.0254      NaN   0.0235
##           ma6     ma7      ma8      ma9     ma10     ma11     ma12    ma13
##       -0.4636  0.2879  -0.2842  -0.0463  -0.0258  -0.1097  -0.0143  0.1553
## s.e.      NaN  0.0565      NaN   0.0619      NaN      NaN   0.0477  0.0377
##          ma14    ma15     ma16     ma17     ma18     ma19     ma20     ma21
##       -0.4219  0.0621  -0.0633  -0.1003  -0.0183  -0.0321  -0.0326  -0.8960
## s.e.      NaN  0.0325   0.0308   0.0530      NaN   0.0748      NaN   0.0256
##          ma22    ma23    ma24   ma25    ma26    ma27    ma28  intercept
##       -0.0608  0.0041  0.2155  0.335  0.3669  0.3532  0.0834      0e+00
## s.e.      NaN  0.0260     NaN    NaN  0.0598     NaN  0.0390      2e-04
## 
## sigma^2 estimated as 0.7105:  log likelihood = -2441.39,  aic = 4998.79

Increase q by 7, with p set to 2: (p,q)=(2,35)

## 
## Call:
## arima(x = detrend_daily, order = c(2, 0, 35))
## 
## Coefficients:
##          ar1      ar2      ma1     ma2     ma3     ma4      ma5      ma6
##       1.2402  -0.9898  -0.4837  0.4497  0.4780  0.1795  -0.0433  -0.2185
## s.e.  0.0050   0.0065   0.0272  0.0303  0.0302  0.0261   0.0236   0.0271
##          ma7      ma8      ma9     ma10     ma11     ma12     ma13    ma14
##       0.4165  -0.6018  -0.1595  -0.0526  -0.1441  -0.2597  -0.4068  0.3703
## s.e.  0.0290   0.0350   0.0422   0.0395   0.0374   0.0314   0.0362  0.0359
##          ma15     ma16    ma17     ma18     ma19     ma20    ma21     ma22
##       -0.4962  -0.0879  0.0753  -0.0239  -0.0567  -0.2744  0.4944  -0.2951
## s.e.   0.0486   0.0531  0.0508   0.0519   0.0422   0.0441  0.0331   0.0430
##          ma23    ma24     ma25    ma26     ma27    ma28     ma29     ma30
##       -0.0759  0.1214  -0.0058  0.0099  -0.2722  0.3793  -0.1809  -0.0446
## s.e.   0.0477  0.0469   0.0518  0.0466   0.0454  0.0304   0.0345   0.0345
##         ma31    ma32    ma33     ma34    ma35  intercept
##       0.0784  0.0712  0.0813  -0.1672  0.1877     0.0004
## s.e.  0.0383  0.0384  0.0364   0.0358  0.0314     0.0011
## 
## sigma^2 estimated as 1.134:  log likelihood = -2876.5,  aic = 5831

(p,q)=(2,21)

## 
## Call:
## arima(x = detrend_daily, order = c(2, 0, 21))
## 
## Coefficients:
##          ar1     ar2     ma1      ma2      ma3      ma4      ma5      ma6
##       0.5944  -0.074  0.1756  -0.0148  -0.0624  -0.1751  -0.2484  -0.3020
## s.e.  0.0861   0.078  0.0831   0.0804   0.0583   0.0381   0.0309   0.0302
##          ma7      ma8      ma9     ma10     ma11     ma12     ma13    ma14
##       0.3909  -0.1035  -0.1748  -0.2047  -0.1964  -0.2667  -0.1949  0.3873
## s.e.  0.0464   0.0450   0.0264   0.0290   0.0351   0.0343   0.0380  0.0398
##          ma15     ma16     ma17     ma18     ma19     ma20    ma21  intercept
##       -0.0865  -0.0173  -0.0342  -0.0133  -0.1042  -0.0813  0.3265      1e-04
## s.e.   0.0405   0.0275   0.0277   0.0232   0.0256   0.0273  0.0235      7e-04
## 
## sigma^2 estimated as 1.454:  log likelihood = -3117.15,  aic = 6284.3

(p,q)=(2,42)

## 
## Call:
## arima(x = detrend_daily, order = c(2, 0, 42))
## 
## Coefficients:
##          ar1     ar2     ma1     ma2      ma3      ma4      ma5     ma6     ma7
##       0.4594  0.0552  0.3779  0.0085  -0.0663  -0.1940  -0.2903  -0.343  0.2315
## s.e.  0.1989  0.1322  0.1978  0.1984   0.1341   0.0714   0.0468   0.061  0.1028
##           ma8      ma9     ma10     ma11     ma12     ma13    ma14    ma15
##       -0.0328  -0.1840  -0.2050  -0.2259  -0.2635  -0.2588  0.3087  0.0123
## s.e.   0.0676   0.0327   0.0537   0.0728   0.0818   0.0935  0.1080  0.0689
##          ma16    ma17    ma18     ma19     ma20    ma21    ma22     ma23
##       -0.0411  0.0043  0.0168  -0.0814  -0.1299  0.4255  0.0469  -0.0402
## s.e.   0.0359  0.0369  0.0337   0.0380   0.0462  0.0428  0.0717   0.0637
##          ma24     ma25     ma26     ma27    ma28    ma29     ma30     ma31
##       -0.0147  -0.0413  -0.0706  -0.0940  0.4005  0.0484  -0.0303  -0.0694
## s.e.   0.0449   0.0327   0.0395   0.0463  0.0391  0.0638   0.0572   0.0447
##          ma32     ma33     ma34    ma35    ma36     ma37     ma38     ma39
##       -0.0794  -0.1049  -0.1095  0.2658  0.0004  -0.0225  -0.0328  -0.0836
## s.e.   0.0346   0.0444   0.0451  0.0422  0.0497   0.0458   0.0408   0.0291
##          ma40     ma41    ma42  intercept
##       -0.1611  -0.0895  0.2119      1e-04
## s.e.   0.0359   0.0447  0.0484      8e-04
## 
## sigma^2 estimated as 1.146:  log likelihood = -2888.57,  aic = 5869.13

(p,q)=(2,14)

## 
## Call:
## arima(x = detrend_daily, order = c(2, 0, 14))
## 
## Coefficients:
##          ar1      ar2     ma1     ma2      ma3      ma4      ma5      ma6
##       0.5722  -0.1239  0.1612  0.0215  -0.0300  -0.1617  -0.2444  -0.3347
## s.e.  0.1859   0.0836  0.1845  0.1314   0.0696   0.0374   0.0380   0.0486
##          ma7      ma8      ma9     ma10     ma11     ma12     ma13    ma14
##       0.4192  -0.1193  -0.2015  -0.1988  -0.1895  -0.1714  -0.2336  0.2831
## s.e.  0.0807   0.0485   0.0262   0.0470   0.0603   0.0558   0.0610  0.0677
##       intercept
##           1e-04
## s.e.      8e-04
## 
## sigma^2 estimated as 1.689:  log likelihood = -3261.07,  aic = 6558.14

The search for the best model amounts to moving the parameters up and down to get better AIC values. That leads us to the model with p = 28 and q = 28, whose AIC value is 4998.79.
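The manual search over (p,q) pairs can also be written as a small grid search. This is only a sketch on synthetic data with a deliberately small grid; the `tryCatch()` wrapper is an added safeguard for orders where `arima()` fails to fit, not something taken from this report.

```r
set.seed(7)
# Synthetic stand-in for detrend_daily: an ARMA(1,1) process (made-up data).
detrend_daily <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 300)

# Small illustrative grid; the report explores much larger orders.
grid <- expand.grid(p = 0:2, q = 0:2)

grid$AIC <- mapply(function(p, q) {
  fit <- tryCatch(
    arima(detrend_daily, order = c(p, 0, q)),
    error = function(e) NULL
  )
  # Orders that fail to fit get NA and are ignored by which.min().
  if (is.null(fit)) NA_real_ else AIC(fit)
}, grid$p, grid$q)

best <- grid[which.min(grid$AIC), ]
```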

# FORECASTING

All the work we have done is now used to forecast data we already have, to test how good our model is. The residuals are defined by Y(t) - Ŷ(t) = residuals. Since we have Y(t) and the residuals, the fitted values follow as Ŷ(t) = Y(t) - residuals.
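In R, the relation between the observed series, the residuals and the fitted values can be sketched as follows. The AR(1) series and the names `y`, `model` and `fitted_values` are illustrative assumptions.

```r
set.seed(3)
# Synthetic series standing in for the modelled data (made-up data).
y <- arima.sim(model = list(ar = 0.5), n = 200)
model <- arima(y, order = c(1, 0, 0))

# residuals = Y(t) - Yhat(t), hence Yhat(t) = Y(t) - residuals.
fitted_values <- y - residuals(model)
```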

After modeling the random component and the electricity consumption, we plot the random component with the fitted values of our model overlaid, followed by the plot of consumption itself.

We forecast the days from 6 May 2021 to 20 May 2021 (15 daily values), although we already have the real data, in order to see how good our model is. We forecast the random component with our ARIMA model and turn the result into time series data. The trend value is the last trend value of the daily decomposition, and the seasonal component is obtained in the same way.
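A sketch of this forecasting step on synthetic data. It assumes an additive decomposition with `frequency = 30`, so the random component is missing for the last 15 observations, which are exactly the days being forecast. The AR(1) order and all variable names are illustrative and are not the model chosen above.

```r
set.seed(11)
# Synthetic daily series: trend + 30-day cycle + noise (made-up data).
n <- 30 * 12
t_idx <- seq_len(n)
x <- 100 + 0.05 * t_idx + 5 * sin(2 * pi * t_idx / 30) + rnorm(n)

daily_ts <- ts(x, frequency = 30)
dec <- decompose(daily_ts)

# The random component is NA for the first and last 15 observations.
random_part <- na.omit(dec$random)
model <- arima(random_part, order = c(1, 0, 0))

# Forecast the random component for the 15 days it does not cover.
h <- 15
random_forecast <- predict(model, n.ahead = h)$pred

# Hold the last available trend value constant and reuse the
# seasonal figures that belong to the forecast days.
last_trend <- as.numeric(tail(na.omit(dec$trend), 1))
seasonal_part <- as.numeric(dec$seasonal[(n - h + 1):n])

forecast_values <- last_trend + seasonal_part + as.numeric(random_forecast)
```

Holding the trend constant is a simplification: if the trend is still moving at the end of the sample, the forecast inherits that bias.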

Although the model does not capture the sudden decline, our forecast looks reasonable. As a summary statistic, we use the WMAPE (weighted mean absolute percentage error) of our forecast.

## Time Series:
## Start = c(66, 3) 
## End = c(66, 17) 
## Frequency = 30 
##  [1] 12.105934 11.901525 11.635454 10.622005 11.292832 11.352290  9.907331
##  [8]  8.473307  8.799297  9.181065  9.533440 11.413106 12.030311 11.673463
## [15] 12.247834

According to the absolute percentage error for each date, our error is below 15 percent on all days (the largest value above is about 12.25), which suggests that our model is good enough.
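For reference, the two error measures can be computed as in the sketch below. The numbers are made up, and the WMAPE definition used here (sum of absolute errors divided by sum of absolute actual values) is the common one, assumed to match what this report intends.

```r
# Made-up actual and forecast values for illustration.
actual <- c(100, 200, 400)
forecast <- c(110, 180, 400)

# Absolute percentage error per observation: 10, 10, 0.
ape <- abs(actual - forecast) / abs(actual) * 100

# Weighted MAPE: total absolute error relative to total actual volume.
wmape <- sum(abs(actual - forecast)) / sum(abs(actual)) * 100
```

Unlike a plain average of the daily percentages, WMAPE weights each day by its consumption, so low-consumption days do not dominate the result.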